Results 1 - 6 of 6
1.
Front Artif Intell ; 5: 889981, 2022.
Article in English | MEDLINE | ID: mdl-35647529

ABSTRACT

Understanding the learning dynamics and inductive bias of neural networks (NNs) is hindered by the opacity of the relationship between NN parameters and the function represented. This opacity stems partly from symmetries inherent in the NN parameterization: many different parameter settings yield the identical output function, obscuring the relationship and introducing redundant degrees of freedom. The NN parameterization is invariant under two symmetries: permutation of the neurons and a continuous family of transformations of the scale of weight and bias parameters. We propose taking a quotient with respect to the second symmetry group and reparametrizing ReLU NNs as continuous piecewise-linear splines. Using this spline lens, we study learning dynamics in shallow univariate ReLU NNs, uncovering unexpected insights and explanations for several perplexing phenomena. We develop a surprisingly simple and transparent view of the structure of the loss surface, including its critical and fixed points, its Hessian, and the Hessian spectrum. We also show that standard weight initializations yield very flat initial functions, and that this flatness, together with overparametrization and the initial weight scale, determines the strength and type of implicit regularization, consistent with previous work. Our implicit regularization results complement recent work showing, via a kernel-based argument, that initialization scale critically controls implicit regularization. Overall, removing the weight-scale symmetry lets us prove existing results more simply, establish new ones, and gain new insights, all within a far more transparent and intuitive picture. Looking forward, our quotiented spline-based approach extends naturally to the multivariate and deep settings, and alongside the kernel-based view, we believe it will play a foundational role in efforts to understand neural networks.
Videos of learning dynamics using a spline-based visualization are available at http://shorturl.at/tFWZ2.
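
The scale symmetry and its spline quotient can be sketched numerically: in a shallow univariate ReLU network, each hidden unit contributes a spline knot at -b_i/w_i and a slope change v_i·w_i, and both quantities are invariant under the rescaling (w_i, b_i, v_i) → (a·w_i, a·b_i, v_i/a) that leaves the function unchanged. A minimal sketch (all sizes and values illustrative, not taken from the paper):

```python
import numpy as np

def relu_net(x, w, b, v, c=0.0):
    # shallow univariate ReLU network: f(x) = sum_i v_i * relu(w_i*x + b_i) + c
    return np.maximum(w[:, None] * x[None, :] + b[:, None], 0.0).T @ v + c

def spline_params(w, b, v):
    # each hidden unit contributes a knot at x = -b/w and a slope change v*w;
    # both are invariant under the scaling symmetry
    # (w, b, v) -> (a*w, a*b, v/a) for a > 0
    knots = -b / w
    delta_slopes = v * w
    return knots, delta_slopes

rng = np.random.default_rng(0)
w = rng.normal(size=5)
b = rng.normal(size=5)
v = rng.normal(size=5)

# rescale each unit by a random positive factor: the function is unchanged
a = rng.uniform(0.5, 2.0, size=5)
x = np.linspace(-3, 3, 7)
f1 = relu_net(x, w, b, v)
f2 = relu_net(x, a * w, a * b, v / a)

k1, s1 = spline_params(w, b, v)
k2, s2 = spline_params(a * w, a * b, v / a)
print(np.max(np.abs(f1 - f2)))   # function values agree despite rescaling
```

Quotienting by the scale symmetry means working directly with the knots and slope changes, which parameterize the represented function without the redundant degrees of freedom.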

2.
Philos Trans A Math Phys Eng Sci ; 379(2194): 20200246, 2021 Apr 05.
Article in English | MEDLINE | ID: mdl-33583272

ABSTRACT

Recent advances in computing algorithms and hardware have rekindled interest in developing high-accuracy, low-cost surrogate models for simulating physical systems. The idea is to replace expensive numerical integration of complex coupled partial differential equations at fine time scales, performed on supercomputers, with machine-learned surrogates that efficiently and accurately forecast future system states using data sampled from the underlying system. One particularly popular technique being explored within the weather and climate modelling community is the echo state network (ESN), an attractive alternative to other well-known deep learning architectures. Using the classical Lorenz 63 system and the three-tier multi-scale Lorenz 96 system (Thornes T, Duben P, Palmer T. 2017 Q. J. R. Meteorol. Soc. 143, 897-908. (doi:10.1002/qj.2974)) as benchmarks, we find that previously studied state-of-the-art ESNs operate in two distinct regimes, corresponding to low and high spectral radius (LSR/HSR) of the sparse, randomly generated reservoir recurrence matrix. Using knowledge of the mathematical structure of the Lorenz systems, along with systematic ablation and hyperparameter sensitivity analyses, we show that state-of-the-art LSR-ESNs reduce to a polynomial regression model, which we call Domain-Driven Regularized Regression (D2R2). Interestingly, D2R2 is a generalization of the well-known SINDy algorithm (Brunton SL, Proctor JL, Kutz JN. 2016 Proc. Natl Acad. Sci. USA 113, 3932-3937. (doi:10.1073/pnas.1517384113)). We also show experimentally that LSR-ESNs (Chattopadhyay A, Hassanzadeh P, Subramanian D. 2019 (http://arxiv.org/abs/1906.08829)) outperform HSR-ESNs (Pathak J, Hunt B, Girvan M, Lu Z, Ott E. 2018 Phys. Rev. Lett. 120, 024102. (doi:10.1103/PhysRevLett.120.024102)), while D2R2 dominates both approaches.
A significant goal in constructing surrogates is to cope with the barriers to scaling in weather prediction and simulation of dynamical systems imposed by the time and energy consumption of supercomputers. Inexact computing has emerged as a novel approach to these scaling barriers. In this paper, we evaluate the performance of three models (LSR-ESN, HSR-ESN and D2R2) while varying the precision, or word size, of the computation as our inexactness-controlling parameter. For precisions of 64, 32 and 16 bits, we show that, surprisingly, the least expensive D2R2 method yields the most robust results and the greatest savings compared to ESNs. Specifically, D2R2 achieves a 68× computational saving, with an additional 2× if precision reductions are also employed, outperforming the ESN variants by a large margin. This article is part of the theme issue 'Machine learning for weather and climate modelling'.
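
The basic ESN recipe the abstract builds on can be sketched as follows: a sparse random reservoir rescaled to a chosen spectral radius is driven by the Lorenz 63 trajectory, and only a ridge-regression readout is trained to forecast the next state. The reservoir size, sparsity, spectral radius, and regularization below are illustrative choices, not those of the cited studies:

```python
import numpy as np

def lorenz63(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

# integrate Lorenz 63 with RK4 at a fine time step
dt, n_steps = 0.01, 3000
traj = np.empty((n_steps, 3))
s = np.array([1.0, 1.0, 1.0])
for i in range(n_steps):
    k1 = lorenz63(s)
    k2 = lorenz63(s + 0.5 * dt * k1)
    k3 = lorenz63(s + 0.5 * dt * k2)
    k4 = lorenz63(s + dt * k3)
    s = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    traj[i] = s

# echo state network: fixed sparse random reservoir, trained linear readout
rng = np.random.default_rng(1)
n_res = 300
W_in = rng.uniform(-0.1, 0.1, size=(n_res, 3))
W = rng.normal(size=(n_res, n_res)) * (rng.random((n_res, n_res)) < 0.02)
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()   # illustrative spectral radius

states = np.zeros((n_steps, n_res))
r = np.zeros(n_res)
for i in range(n_steps - 1):
    r = np.tanh(W @ r + W_in @ traj[i])
    states[i + 1] = r                # states[j] has seen inputs up to traj[j-1]

# ridge-regression readout: predict traj[j] from states[j] (a one-step forecast)
washout = 200
X, Y = states[washout:], traj[washout:]
lam = 1e-6
W_out = np.linalg.solve(X.T @ X + lam * np.eye(n_res), X.T @ Y)
pred = X @ W_out
rmse = float(np.sqrt(np.mean((pred - Y) ** 2)))
print(rmse)
```

The D2R2 reduction described in the abstract replaces this random reservoir with explicit polynomial features of the state, turning the forecast into a regularized polynomial regression.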

3.
Neural Comput ; 31(7): 1430-1461, 2019 07.
Article in English | MEDLINE | ID: mdl-31113300

ABSTRACT

Reservoir computing is a biologically inspired class of learning algorithms in which the intrinsic dynamics of a recurrent neural network are mined to produce target time series. Most existing reservoir computing algorithms rely on fully supervised learning rules, which require access to an exact copy of the target response, greatly reducing the utility of the system. Reinforcement learning rules have been developed for reservoir computing, but we find that they fail to converge on complex motor tasks. Current theories of biological motor learning posit that early learning is controlled by dopamine-modulated plasticity in the basal ganglia, which trains parallel cortical pathways through unsupervised plasticity as a motor task becomes well learned. We developed a novel learning algorithm for reservoir computing that models this interaction between reinforcement and unsupervised learning observed in experiments. The algorithm converges on simulated motor tasks on which previous reservoir computing algorithms fail, and it reproduces experimental findings that relate Parkinson's disease and its treatments to motor learning. Hence, incorporating biological theories of motor learning improves both the effectiveness and the biological relevance of reservoir computing models.
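
The paper's algorithm couples reinforcement with unsupervised plasticity and is not specified in the abstract; the sketch below shows only the generic reward-modulated ingredient that such rules build on: noise on the readout acts as exploration, and a scalar reward compared against a running baseline turns that noise into an estimate of the error gradient, with no teacher signal. All sizes and constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# fixed random reservoir driven by a periodic input; only the linear
# readout would be plastic (illustrative sizes, not the paper's model)
n_res, T = 100, 200
W = rng.normal(scale=1.0 / np.sqrt(n_res), size=(n_res, n_res))
w_in = rng.normal(size=n_res)
drive = 0.5 * np.sin(2 * np.pi * np.arange(T) / 50)
target = np.sin(2 * np.pi * np.arange(T) / 50 + 0.5)

r = np.zeros(n_res)
states = np.empty((T, n_res))
for t in range(T):
    r = np.tanh(W @ r + w_in * drive[t])
    states[t] = r

# three-factor update: (reward surprise) x (exploration noise) x (activity);
# the reward is just the negative trial error, compared to its expected value
w_out = rng.normal(scale=0.1, size=n_res)
clean = states @ w_out
noise_std = 0.1
baseline = np.mean((clean - target) ** 2) + noise_std ** 2  # expected noisy error

updates = np.zeros(n_res)
n_probes = 2000
for _ in range(n_probes):
    noise = noise_std * rng.normal(size=T)
    err = np.mean((clean + noise - target) ** 2)
    updates += (baseline - err) * (noise @ states) / T
updates /= n_probes

# averaged over trials, the reward-modulated update points along the
# negative gradient of the error: stochastic gradient descent from a
# scalar reward alone
grad = 2.0 / T * ((clean - target) @ states)
cosine = updates @ (-grad) / (np.linalg.norm(updates) * np.linalg.norm(grad))
print(cosine)
```

The abstract's point is that this reinforcement-only ingredient converges too slowly on complex motor tasks, which is why the proposed algorithm hands control over to unsupervised cortical plasticity as the task becomes well learned.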


Subjects
Computer Simulation; Nerve Net/physiology; Neural Networks, Computer; Reward; Humans; Models, Neurological; Neuronal Plasticity/physiology; Neurons/physiology; Reinforcement, Psychology
4.
Neuron ; 101(2): 337-348.e4, 2019 01 16.
Article in English | MEDLINE | ID: mdl-30581012

ABSTRACT

Trial-to-trial variability is a reflection of the circuitry and cellular physiology that make up a neuronal network. A pervasive yet puzzling feature of cortical circuits is that, despite their complex wiring, population-wide shared spiking variability is low dimensional. Previous models of cortical networks cannot explain this global variability and instead assume it arises from external sources. We show that if the spatial and temporal scales of inhibitory coupling match known physiology, networks of model spiking neurons internally generate low-dimensional shared variability that captures population activity recorded in vivo. Shifting spatial attention into the receptive field of visual neurons has been shown to differentially modulate shared variability within and between brain areas. A top-down modulation of inhibitory neurons in our network provides a parsimonious mechanism for this attentional modulation. Our work provides a critical link between observed cortical circuit structure and realistic shared neuronal variability and its modulation.
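
What "low-dimensional shared variability" means can be illustrated independently of the paper's spiking model: if trial-to-trial spike counts are driven by a single shared latent fluctuation plus private Poisson noise, the population covariance has one dominant eigenvalue. A hypothetical toy example (not the paper's network):

```python
import numpy as np

rng = np.random.default_rng(5)

# spike counts driven by one shared latent fluctuation plus private
# Poisson noise: the shared part of the covariance is one-dimensional
n_neurons, n_trials = 50, 2000
latent = rng.normal(size=n_trials)                  # shared fluctuation per trial
loadings = rng.uniform(0.5, 1.5, size=n_neurons)    # per-neuron coupling
rates = 5.0 + loadings[:, None] * latent[None, :]   # trial-varying firing rates
counts = rng.poisson(np.clip(rates, 0.0, None))

# eigen-spectrum of the population covariance: the leading eigenvalue
# (the shared dimension) dominates the private-noise eigenvalues
cov = np.cov(counts)
evals = np.sort(np.linalg.eigvalsh(cov))[::-1]
print(evals[:3])
```

The paper's contribution is showing that spiking networks with physiological inhibitory coupling generate such a dominant shared dimension internally, rather than needing an externally imposed latent input like the one simulated here.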


Subjects
Attention/physiology; Models, Neurological; Nerve Net/physiology; Neurons/physiology; Visual Cortex/cytology; Action Potentials/physiology; Animals; Computer Simulation; Factor Analysis, Statistical; Humans; Neural Inhibition/physiology; Photic Stimulation
5.
Phys Rev Lett ; 118(1): 018103, 2017 Jan 06.
Article in English | MEDLINE | ID: mdl-28106418

ABSTRACT

Randomly connected networks of excitatory and inhibitory spiking neurons provide a parsimonious model of neural variability, but are notoriously unreliable for performing computations. We show that this difficulty is overcome by incorporating the well-documented dependence of connection probability on distance. Spatially extended spiking networks exhibit symmetry-breaking bifurcations and generate spatiotemporal patterns that can be trained to perform dynamical computations under a reservoir computing framework.
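
The key ingredient, a connection probability that decays with distance, can be sketched for neurons on a ring; the Gaussian width and peak connection probability below are illustrative, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(3)

# neurons evenly spaced on a ring; connection probability falls off with
# distance following a Gaussian profile (illustrative parameters)
n = 400
pos = np.arange(n) / n
d = np.abs(pos[:, None] - pos[None, :])
d = np.minimum(d, 1.0 - d)            # shortest distance around the ring
p_max, width = 0.5, 0.05
p = p_max * np.exp(-d ** 2 / (2 * width ** 2))
A = (rng.random((n, n)) < p).astype(float)
np.fill_diagonal(A, 0.0)

# empirical connection probability drops with distance, unlike an
# Erdos-Renyi network where it would be flat
near = A[d < 0.05].mean()
far = A[d > 0.25].mean()
print(near, far)
```

In the paper, this spatial structure (in contrast to uniform random connectivity) is what enables symmetry-breaking bifurcations and trainable spatiotemporal patterns.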


Subjects
Action Potentials; Computer Simulation; Neurons/physiology; Algorithms; Models, Neurological
6.
Phys Rev E ; 93: 040302, 2016 04.
Article in English | MEDLINE | ID: mdl-27176240

ABSTRACT

Biological neuronal networks exhibit highly variable spiking activity. Balanced networks offer a parsimonious model of this variability in which strong excitatory synaptic inputs are canceled by strong inhibitory inputs on average, and irregular spiking activity is driven by fluctuating synaptic currents. Most previous studies of balanced networks assume a homogeneous or distance-dependent connectivity structure, but connectivity in biological cortical networks is more intricate. We use a heterogeneous mean-field theory of balanced networks to show that heterogeneous in-degrees can break balance. Moreover, heterogeneous architectures that achieve balance promote lower firing rates in neurons with larger in-degrees, consistent with some recent experimental observations.
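
The balance argument can be caricatured in a few lines (numbers illustrative, not the paper's model): with strong synapses, a neuron's mean input scales as √N times the difference between its excitatory and inhibitory drive, so a single pair of population rates can cancel this term for every neuron only if in-degrees are homogeneous (or proportional) across neurons:

```python
import numpy as np

rng = np.random.default_rng(4)

# mean-field caricature: neuron i has excitatory in-degree N*ke_i and
# inhibitory in-degree N*ki_i with O(1/sqrt(N)) synapses, giving mean input
#   mu_i = sqrt(N) * (ke_i * je * re - ki_i * ji * ri)
# balance requires mu_i = O(1) for every neuron as N grows
N, n_show = 10_000, 1000
je, ji = 1.0, 2.0
re, ri = 10.0, 5.0     # chosen so 0.1*je*re - 0.1*ji*ri = 0 (exact cancellation)

# homogeneous in-degrees: one rate pair cancels the O(sqrt(N)) term for all
ke = np.full(n_show, 0.1)
ki = np.full(n_show, 0.1)
mu_hom = np.sqrt(N) * (ke * je * re - ki * ji * ri)

# independently heterogeneous in-degrees: no single rate pair can cancel
# the term for every neuron, so most receive O(sqrt(N)) net input
ke = rng.uniform(0.05, 0.15, size=n_show)
ki = rng.uniform(0.05, 0.15, size=n_show)
mu_het = np.sqrt(N) * (ke * je * re - ki * ji * ri)

print(np.abs(mu_hom).max(), np.abs(mu_het).mean())
```

This is the sense in which heterogeneous in-degrees "break balance"; the paper's mean-field theory then characterizes the heterogeneous architectures for which balance survives.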
